Statistical Savvy for Librarians: Activity

2025-03-12

Descriptive statistics

Descriptive statistics summarize a dataset, commonly through measures of central tendency or spread

Examples: mean, median, mode, range, standard deviation
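
As a rough sketch, each of these measures can be computed with Python's built-in `statistics` module. The numbers below are made up for illustration and are not survey data.

```python
import statistics

# Hypothetical data: emails sent today by seven people (made up for illustration)
emails = [2, 3, 3, 5, 8, 10, 11]

mean = statistics.mean(emails)            # arithmetic average
median = statistics.median(emails)        # middle value when sorted
mode = statistics.mode(emails)            # most frequent value
value_range = max(emails) - min(emails)   # distance between the extremes
std_dev = statistics.stdev(emails)        # sample standard deviation

print(f"mean={mean}, median={median}, mode={mode}")
print(f"range={value_range}, standard deviation={std_dev:.2f}")
```

The mean, median, and mode describe the center of the data; the range and standard deviation describe its spread.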

Descriptive statistics are usually used to set the stage for an analysis. You might use them to

Summary of Survey Responses

2 people responded

0 people didn’t answer every question 🤓

How many cups of a caffeinated beverage have you had today?

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
      1       1       1       1       1       1 

How many emails have you sent so far today?

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
      1       2       3       3       4       5 
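
The six-number layout above looks like output from R's `summary()`. A minimal Python sketch of the same summary follows, assuming the two raw answers were 1 and 5 (the only pair consistent with two respondents, a minimum of 1, and a maximum of 5).

```python
import statistics

# Assumption: with two respondents, a minimum of 1, and a maximum of 5,
# the raw answers must have been 1 and 5.
emails = [1, 5]

# method="inclusive" interpolates between the smallest and largest values,
# which should agree with the quartiles in the table above.
q1, median, q3 = statistics.quantiles(emails, n=4, method="inclusive")

summary = {
    "Min.": min(emails),
    "1st Qu.": q1,
    "Median": median,
    "Mean": statistics.mean(emails),
    "3rd Qu.": q3,
    "Max.": max(emails),
}
print(summary)
```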

What’s your favorite season?

       Count Percent (%)
spring     1          50
winter     1          50

Descriptive statistics

Datasaurus Dozen GIF: Matejka, J., & Fitzmaurice, G., 2017; based on Anscombe, 1973
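
The Datasaurus Dozen shows that datasets with nearly identical summary statistics can look completely different when plotted. A tiny sketch of the same idea, using two made-up lists rather than the Datasaurus data:

```python
import statistics

# Two made-up datasets, chosen so that their summary statistics coincide
spread_out = [2, 4, 4, 4, 5, 5, 7, 9]
two_clusters = [3, 3, 3, 3, 7, 7, 7, 7]

for name, data in [("spread_out", spread_out), ("two_clusters", two_clusters)]:
    # pstdev is the population standard deviation
    print(name, statistics.mean(data), statistics.pstdev(data))
```

Both lists have the same mean and standard deviation, yet one is spread across its range and the other sits in two clumps. Summary numbers alone would not reveal the difference; a plot would.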

Inferential Statistics

Inferential statistics use probability theory to draw conclusions from a dataset

Examples: Chi-squared test, ANOVA, regression

Inferential statistics are usually used to test a hypothesis, for example

Example: Correlation

A correlation describes the relationship between two variables
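
One common way to put a number on a correlation is Pearson's r, which runs from -1 (perfect negative relationship) through 0 (no linear relationship) to +1 (perfect positive relationship). The slide does not name a specific coefficient, so Pearson's r is an assumption here, and the data are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # How much x and y move together around their means
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable moves on its own
    variation_x = sum((x - mean_x) ** 2 for x in xs)
    variation_y = sum((y - mean_y) ** 2 for y in ys)
    return co_variation / math.sqrt(variation_x * variation_y)

# Hypothetical data: cups of caffeine and emails sent for five people
cups = [0, 1, 2, 3, 4]
emails = [2, 3, 7, 8, 10]

r = pearson_r(cups, emails)
print(f"r = {r:.3f}")
```

A value of r close to +1 here would say that people who drank more caffeine also tended to send more emails. It would not say that caffeine caused the emails.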

Statistical Significance

Statistical significance indicates how unlikely a result would be if random chance alone were at work

A low p-value (for example, below 0.05 or 0.01) means that a result at least this extreme would be unlikely under random chance alone
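
To make the idea of a p-value concrete, here is a small sketch using coin flips, a hypothetical example that is not from the survey. It asks how often a fair coin would produce a result at least as lopsided as the one observed.

```python
from math import comb

def p_value_at_least(heads, flips):
    """Probability of seeing `heads` or more heads in `flips` fair coin tosses."""
    # Count the outcomes with at least `heads` heads, out of 2**flips equally
    # likely sequences
    favorable = sum(comb(flips, k) for k in range(heads, flips + 1))
    return favorable / 2 ** flips

print(p_value_at_least(9, 10))  # 11/1024, about 0.011
print(p_value_at_least(6, 10))  # 386/1024, about 0.377
```

Nine heads in ten flips would be surprising for a fair coin (p is about 0.011), so chance alone is a poor explanation. Six heads in ten is unremarkable (p is about 0.377).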

Example: Regression

Regression analysis is a family of methods to characterize the relationship between an outcome of interest and explanatory variables

- The most common regression method is linear regression
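
A minimal sketch of simple linear regression with one explanatory variable, fitted by ordinary least squares. The function name `fit_line` and the data are hypothetical, chosen only for illustration.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    # The fitted line always passes through the point (mean_x, mean_y)
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: cups of caffeine (explanatory) and emails sent (outcome)
cups = [0, 1, 2, 3, 4]
emails = [2, 3, 7, 8, 10]

slope, intercept = fit_line(cups, emails)
print(f"emails = {intercept:.1f} + {slope:.1f} * cups")
```

The slope is the expected change in the outcome for each one-unit increase in the explanatory variable. In this made-up data, that is about 2.1 more emails per extra cup.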

Example: chi-squared test

A chi-squared test is used to analyze whether two categorical variables are related

- Also called a chi-squared test of independence

The intuition is to see whether the observed frequencies match the frequencies we would expect if the two variables were unrelated

Is there a relationship between personality and movie preferences?

         
          doer dreamer
  action     1       1
  comedy     1       2
  romance    3       4
  scary      3       4
  sci-fi     2       1

    Pearson's Chi-squared test

data:  table
X-squared = 0.77698, df = 4, p-value = 0.9415
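
The test output above can be reproduced by hand. The sketch below recomputes the statistic from the counts in the table using only Python's standard library. The p-value line uses a closed-form shortcut that holds only when df is 4; a general solution would use a statistics library instead.

```python
import math

# Observed counts from the table above: (doer, dreamer) for each movie genre
observed = {
    "action": (1, 1),
    "comedy": (1, 2),
    "romance": (3, 4),
    "scary": (3, 4),
    "sci-fi": (2, 1),
}

rows = list(observed.values())
row_totals = [sum(row) for row in rows]
col_totals = [sum(col) for col in zip(*rows)]
grand_total = sum(row_totals)

# Expected count if the variables are unrelated:
# row total * column total / grand total
chi_squared = 0.0
for row, row_total in zip(rows, row_totals):
    for count, col_total in zip(row, col_totals):
        expected = row_total * col_total / grand_total
        chi_squared += (count - expected) ** 2 / expected

# Degrees of freedom: (number of rows - 1) * (number of columns - 1)
df = (len(rows) - 1) * (len(col_totals) - 1)

# Upper-tail probability of the chi-squared distribution; this closed form
# is valid only for df = 4
assert df == 4
p_value = math.exp(-chi_squared / 2) * (1 + chi_squared / 2)

print(f"X-squared = {chi_squared:.5f}, df = {df}, p-value = {p_value:.4f}")
```

A p-value this high means the observed counts are very close to what we would expect if personality and movie preference were unrelated. With expected counts this small, the chi-squared approximation is rough, so the result is best read as illustrative.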

Statistical significance
